(57) Abstract: A method of blending images of segments of a view includes determining the position (56) of a second segment of 
the view represented by a second image relative to a first segment of the view represented by a first image; dividing (54) the second 
image into a first section and a second section, based on the determined positions; drawing the first image on a canvas (24); and 
drawing the first section of the second image on the canvas at the determined position so that a portion of the first section masks out 
a portion of the first image. 
MERGING IMAGES TO FORM A PANORAMIC IMAGE

TECHNICAL FIELD 

This invention relates to merging images to form a panoramic image. 

BACKGROUND 

Image capture devices, such as cameras, are used to capture an image of a section of a
view, such as a section of the front of a house. The section of the view whose image is
captured by a camera is known as the field of view of the camera. Adjusting a lens
associated with a camera may increase the field of view. However, there is a limit beyond
which the field of view of the camera cannot be increased without compromising the quality,
or "resolution", of the captured image. It is sometimes necessary to capture an image of a
view that is larger than can be captured within the field of view of a camera. Multiple
overlapping images of segments of the view are taken and then the images are joined
together, or "merged," to form a composite image, known as a panoramic image.

An image captured by a camera distorts the sizes of objects depicted in the image so
that distant objects appear smaller than closer objects. The size distortion, which is known as
perspective distortion, depends on the camera position, the pointing angle of the camera, and
so forth. Consequently, an object depicted in two different images might not have the same
size in the two images, because of perspective distortion.

SUMMARY 

In general, one aspect of the invention relates to a method of blending images of
segments of a view. The method includes determining the position of a second segment of
the view represented by a second image relative to a first segment of the view represented by
a first image, dividing the second image into a first section and a second section, based on the
determined positions, drawing the first image on a canvas, and drawing the first section of
the second image on the canvas at the determined position so that a portion of the first
section masks out a portion of the first image.
In general, another aspect of the invention relates to an article that includes a
computer-readable medium, which stores computer-executable instructions for blending 
images of segments of a view according to the method described above. 

Determining the position of the segment depicted in the second image relative to the 
segment in the first allows the method to blend images that may represent segments of the 
view that are arbitrarily positioned relative to each other. It also allows the method to blend 
images that may have arbitrary shapes and sizes. The method also saves processing time by 
drawing the first image without altering it and then masking out portions of the first image 
with a section of the second image. 

Embodiments of the invention may include one or more of the following features. 
The method further includes determining a position of a third segment of the view, 
represented by a third image, relative to the first segment, dividing the third image into a 
third section and a second section, based on the determined position relative to the first 
segment, determining a position of the third segment of the view relative to the second 
image, dividing the third section into a fifth and a sixth section, based on the determined 
position relative to the second image, and drawing the fifth section of the third image on the 
canvas at the determined position relative to the third image so that a portion of the fifth 
section obstructs at least one of the first image and the first section of the second image. 
Thus the method allows a new image to be added to the blended panoramic image without 
performing any additional processing of the earlier images. The method only computes the 
section of the new image that should be drawn over the panoramic image. 

The method responds to a command to remove the third image by erasing the canvas; 
drawing the first image on the canvas; and drawing the first section of the second image on 
the canvas at the determined position of the second segment relative to the first segment so 
that portions of the first section mask out portions of the first image. The method saves 
processing time by simply drawing the previously determined first section on the first image, 
without performing any additional computations. 

Prior to dividing the second image, perspective distortion in the second image is 
corrected to improve the quality of the panoramic image. The second image is divided into 
the first and second section by a dividing line that is determined based on an outline of the 
first image; an outline of the second image; and the relative position of the second image 
segment relative to the first image segment. The dividing line joins two points of intersection 



WO 01/88838 



PCT/US01/40490 



of the outlines of the first and second images when the second image is positioned at the 
determined relative position, e.g., two most distant points of intersection. The first section of 
the second image is determined based on how much of the second image on each side of the 
dividing line is overlapped by the first image. A region around the dividing line where the 
second image is mixed with the first image to smooth out the transition between the first
image and the second image is determined. The dividing line divides the region into a first
sub-region contained within the first segment of the second image and a second sub-region
contained within the second segment of the second region. More of the second image is
mixed in the first sub-region than the second sub-region to provide a smoother transition
between the first and second images.

The details of one or more embodiments of the invention are set forth in the accompanying
drawings and the description below. Other features, objects, and advantages of the
invention will be apparent from the description and drawings, and from the claims.

DESCRIPTION OF DRAWINGS 

FIG. 1 is a block diagram of a system for blending images of overlapping segments of
a view;

FIG. 2A shows four exemplary images of overlapping segments of a view;

FIG. 2B is a panoramic image formed by the system of FIG. 1 by blending the images
of FIG. 2A;

FIG. 2C shows the images of FIG. 2A along with positioning information used in
blending the images;

FIGs. 2D and 2E show the images of FIG. 2A after they have been aligned using
positioning information;

FIG. 2F shows the images of FIG. 2A after the system of FIG. 1 has corrected them
for perspective distortion;

FIG. 3A shows the process used by the system of FIG. 1 to blend images;

FIGs. 3B and 3C show various image outlines used to blend the images of FIG. 2F;

FIG. 3D is a plot of mask values used in blending the images of FIG. 2F;

FIG. 3E shows various image outlines used to blend an image of FIG. 2F; and

FIG. 4 shows various intermediate images drawn when creating the panoramic image.

Like reference symbols in the various drawings indicate like elements. 
DETAILED DESCRIPTION

Referring to FIG. 1, a computer system 10 for blending images 18 has a processor 12 
for executing programs 12, 14 stored within a storage 16. Storage 16 is a computer readable 
medium, such as a CDROM, a hard disk, a hard disk array, a floppy disk or a ROM. The 
computer programs 12, 14 (i.e., the image capture software 12 and the image stitching 
software 14) are loaded into computer readable memory 16 and then executed to process the 
images 18. The computer system 10 is associated with a scanner 20 for converting the 
images 18 into a digital format, a computer keyboard 22, and a pointing device 24 for
capturing input from a user (not shown). The computer system 10 is also associated with a 
monitor 28 for displaying images, and a printer 30 for printing images. The computer system 
10 also includes a network interface 34 for communicating with devices connected to a 
computer network 32. 

The user (not shown) activates the scanner 20 using the keyboard 22 or the pointing 
device 24, causing the scanner to scan and transmit the images 18 to the image capture
software 12. The image capture software 12 is a TWAIN application-programming interface 
(API) that captures the images 18 and conveys them to image stitching software 14. Image 
stitching software 14 blends the images 18 together to form a panoramic image 26 that is 
displayed on the monitor 28 or printed on the printer 30. The panoramic image 26 may also 
be stored within the storage 16 or transmitted to a remote location over the computer network 
32 through the network interface 34. 

Referring to FIG. 2A, images 18 depict overlapping segments of a view that are
common to all the images. For example, images 18a-18d all depict segments of the front
view of a house. The first image 18a depicts a central segment of the front view and is
centered about the entrance to the house. The second image 18b depicts an upper segment of
the view to include a balcony 70 on an upper floor of the house, while the third image 18c
depicts a left segment of the front view to include a tree 71 located to the left of the entrance.
The fourth image 18d depicts a right segment of the view to include the window 72 to the right
of the entrance.

Referring to FIG. 2B, image stitching software 14 (FIG. 1) blends the images 18a-18d 
to generate a single panoramic image 26 that includes the balcony 70, the tree 71, and the 
window 72. Thus, the image stitching software 14 allows a user to blend multiple images 
18a-18d to create a panoramic image 26 with a field of view that is larger than the field of
view of any one of the multiple images.

Referring again to FIG. 1, the positioning module 50 of the image stitching software
14 determines the relative positions of the segments depicted in two of the images 18a-18d so
that an image of an object depicted in one of the images can be aligned with another image of
the same object. The positioning module 50 automatically determines the relative
positioning of the two segments corresponding to the images using known methods described
in "Direct Estimation of Displacement Histograms", Proceedings of the OSA Meeting on
Image Understanding and Machine Vision, June 1989, Bernd Girod & David Kuo ("Girod"),
which is incorporated herein by reference. The software modules are dynamically linked,
machine language libraries that are obtained by compiling a high level computer
programming language, such as "C++" or "C". The functions and operations of the different
software modules will be described below.

Referring to FIG. 2C, the determination of the relative position of the segments will
be described with reference to the position of the top left corner of the doorway relative to
the bottom left corner of each of the images 18a-18d. For example, the top left corner of the
doorway is horizontally displaced from the bottom left corner of the image by a distance $x_0$ in
the first image 18a, while it is displaced by a distance $x_1$ in the second image 18b.
Consequently the second image is displaced to the left of the first image by a distance
($d_{\text{left}}$) given by the mathematical equation:

$d_{\text{left}} = x_0 - x_1$.

Similarly, the top left corner of the doorway is vertically displaced from the bottom left
corner of the image by a distance $y_0$ in the first image 18a, while it is displaced by a distance
$y_1$ in the second image 18b. The second image is, therefore, displaced below the first image by
a distance ($d_{\text{down}}$) given by the mathematical equation:

$d_{\text{down}} = y_0 - y_1$.

To align the top left corner of the doorway in the first image 18a and the second image 18b,
the two images are overlapped and the second image is displaced by the distance $d_{\text{left}}$ to the
right and displaced by the distance $d_{\text{down}}$ upwards, as shown in FIG. 2D. The other images
18c and 18d are also overlapped and displaced in a similar fashion to align the pixel
representing the top left corner of the doorway in one image to other pixels representing the
same corner of the doorway in other images. The result of aligning all the images 18a-18d is
shown in FIG. 2E.
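
By way of illustration only, the two displacement equations may be sketched as follows. The function name and the pixel coordinates are hypothetical and are not taken from the described embodiment; the sketch merely evaluates the two differences for a feature, such as the top left corner of the doorway, located in both images.

```python
def alignment_offset(feature_in_first, feature_in_second):
    """Return (d_left, d_down) for a feature located in both images.

    Each argument is the (x, y) position of the same feature, measured
    from the bottom left corner of its own image.
    """
    x0, y0 = feature_in_first
    x1, y1 = feature_in_second
    d_left = x0 - x1   # horizontal displacement
    d_down = y0 - y1   # vertical displacement
    return d_left, d_down


# Hypothetical positions of the doorway corner in the first and second images.
d_left, d_down = alignment_offset((120, 200), (90, 60))

# Displacing the feature of the second image by the offsets brings it onto
# the feature of the first image.
aligned = (90 + d_left, 60 + d_down)
```

In this sketch, displacing the second image by the computed distances places the doorway corner at the same canvas position in both images.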

However, as shown in FIG. 2E, the overlapping images resulting from the positional 
alignment described above may not seamlessly blend into each other. For example, a seam 
62 is created across a staircase 63 (depicted in the overlapped images) where the two images 
18c and 18d join each other. Consequently, additional processing is required to blend the
images into each other and create the near-seamless panoramic image 26 (FIG. 2B). The 
perspective corrector 52 and the other modules 54-58 perform the additional steps, as 
described below. 

To reduce seams 62 (FIG. 2E) in the blended image, the perspective corrector 52
corrects perspective distortions within the images using known methods described in "Virtual
Bellows", Proceedings of IEEE International Conference on Image Processing, Nov. 1994,
Steven Mann & R.W. Picard, which is incorporated herein by reference. The perspective
of each of the original images 18b-18d (FIG. 2A) is corrected relative to the first image 18a
by enlarging one side of the images 18b-18d corresponding to more distant objects
and/or shrinking another side of the images 18b-18d corresponding to closer objects. The
perspective correction yields trapezoidal second 18b', third 18c', and fourth 18d' images
(shown in FIG. 2F). Aligning the trapezoidal images results in smaller seams 62 (FIG. 2E)
because the objects in the images do not have distorted sizes.
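
The way in which shrinking one side of a rectangular image yields a trapezoidal outline may be illustrated with a projective transform applied to the image corners. This is a sketch under stated assumptions and is not the method of the cited reference: the 3x3 matrix `H`, its coefficients, and the function name are hypothetical values chosen only to show the effect.

```python
def apply_homography(h, point):
    """Map an (x, y) point through a 3x3 projective matrix h."""
    x, y = point
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return ((h[0][0] * x + h[0][1] * y + h[0][2]) / w,
            (h[1][0] * x + h[1][1] * y + h[1][2]) / w)


# Hypothetical matrix: the small term in the bottom row shrinks the image
# progressively towards its right side.
H = [[1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0],
     [0.001, 0.0, 1.0]]

# Corners of a 100 x 50 rectangular image, listed counter-clockwise.
corners = [(0, 0), (100, 0), (100, 50), (0, 50)]
warped = [apply_homography(H, c) for c in corners]

left_height = warped[3][1] - warped[0][1]    # unchanged side
right_height = warped[2][1] - warped[1][1]   # shrunken side
```

The left and right sides of the warped outline remain vertical but have different heights, so the outline is a trapezoid.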

Referring to FIGs. 3A-3C, a process for blending images implemented by the
computer system of FIG. 1 will be described using the images 18a-18d as examples. The
process begins when the image capture software 12 (FIG. 1) captures (200) the images 18
(FIG. 1) that are to be blended. The positioning module 50 (FIG. 1) determines (202) the
position of the segment of the view corresponding to each image 18b-18d relative to the
segment of the view corresponding to the first image 18a (as previously described with
reference to FIGs. 2C and 2D), and the perspective corrector 52 corrects (204) perspective
distortion in each image 18b-18d relative to the reference image 18a (as previously described
with reference to FIG. 2F). The stitching software 14 (FIG. 1) then sets (206) a visible
property of the pixels of all the images to indicate that all the pixels of all the images start off
being visible. The stitching software then sets (208) the current image to be the first image
and proceeds to determine the visible area of each of the images 18a, 18b'-18d' as described
below.
The stitching software 14 sets (210) the current image to be the next image 18b' after
the current image 18a and sets the reference image to be the first image 18a, thereby
leaving all the pixels 72 of the first image 18a visible (indicated by hash marks in FIG. 3B).
Although all the pixels of the first image are set visible, some of the pixels of the first image
may be obstructed or masked out by visible portions of subsequent images, as described later.

The dividing-line determiner 54 (FIG. 1) determines (212) an outline 74 (FIG. 3C) of
a panoramic image formed by aligning the current image 18b' and the reference image 18a
(as previously described with reference to FIG. 2D). The dividing-line determiner 54 also
determines a pair of points 76, 78 where the outlines of the aligned images intersect, thereby
defining (214) a line 80 that divides (216) the panoramic outline 74 into two sections 82, 84.
If the outlines of the aligned images intersect at more than two points, the
dividing-line determiner 54 selects the two intersection points that are furthest apart from
each other to define the dividing line 80. The dividing-line determiner 54 then determines (218)
which one of the two sections 82, 84 has less of the current image 18b' that is not overlapped
by the reference image 18a and sets (220) that section of the current image to be invisible. In
the example of FIG. 3C, the section 84 has none of the current image profile 73 that is not
overlapped by the first image 18a. Consequently, the portions of the image profile 85
contained within the section 84 are set invisible, leaving the hashed section 82 of the image
18b' visible.
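
The selection of the dividing line and of the invisible section may be sketched as follows. The sketch assumes that the intersection points of the two outlines are already known and that images are represented by the canvas positions of their pixels; the function names and the example rectangles are hypothetical.

```python
from itertools import combinations
from math import dist


def furthest_pair(points):
    """Pick the two intersection points that are furthest apart."""
    return max(combinations(points, 2), key=lambda pair: dist(*pair))


def side_of_line(p, a, b):
    """Return +1 or -1 for the side of the line a-b on which p lies, 0 if on it."""
    cross = (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])
    return (cross > 0) - (cross < 0)


def visible_side(current_pixels, in_reference, a, b):
    """Return the side holding more current-image pixels not overlapped by
    the reference image; the opposite side is the one to set invisible."""
    counts = {1: 0, -1: 0}
    for p in current_pixels:
        s = side_of_line(p, a, b)
        if s != 0 and not in_reference(p):
            counts[s] += 1
    return 1 if counts[1] >= counts[-1] else -1


# Hypothetical example: a 10 x 10 reference image at the origin and a
# 10 x 10 current image displaced by (3, 6) on the canvas.
a, b = furthest_pair([(3, 10), (10, 6)])
current = [(x, y) for x in range(3, 13) for y in range(6, 16)]
keep = visible_side(current, lambda p: 0 <= p[0] < 10 and 0 <= p[1] < 10, a, b)
```

Counting the non-overlapped pixels on each side is one possible way to decide which section "has less of the current image that is not overlapped"; other area measures would serve equally well.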

The stitching software 14 checks (222) whether there are any more images between
the reference image 18a and the current image 18b'. If there are more images, the stitching
software 14 sets (224) the reference image to be the next image after the current reference
image and repeats the process of setting a section of the current image 18b' invisible (208-220)
described above. Otherwise, if there are no more images, the blending mask determiner 56
(FIG. 1) determines (226) the pixels within the current image that will mask out pixels of
earlier images. Only visible pixels 82 of the current image 18b' mask out pixels of earlier
images. Consequently, the mask value of pixels contained within the region 82 is set to "1",
while the mask value of pixels contained within the region 84 is set to "0".

The blending mask determiner 56 smooths the boundary between the region 82 with
mask values set to 1 and the region 84 with mask values set to 0 by applying a method
described in "A Multiresolution Spline With Application to Image Mosaics", ACM
Transactions on Graphics, Vol. 2, No. 4, October 1983, P.J. Burt & E.H. Adelson, which is
incorporated herein by reference. Referring to the close-up 100 of FIG. 3C, the smoothing
establishes a transition band 104 within the invisible section 84 and next to the dividing line
80 where the mask value transitions smoothly from a value of "1" at the dividing line to a
value of "0", thereby eliminating sharp discontinuities in the panoramic images at the
dividing line 80 where blended images 18a, 18b' intersect, as will be described later.

As shown in FIG. 3D, the mask value is "1" within the visible region 82. The 
smoothing function causes the mask value to reduce smoothly within the transition band 104 
to a value of "0" within the invisible region 84. 
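
A mask profile of the kind plotted in FIG. 3D may be sketched as follows. The sketch substitutes a simple linear ramp for the multiresolution method of the cited reference, which is an assumption made for brevity; the function names, the `band_width` parameter (which must be greater than zero), and the example line are hypothetical.

```python
from math import hypot


def signed_distance(p, a, b):
    """Signed perpendicular distance from p to the line through a and b."""
    cross = (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])
    return cross / hypot(b[0] - a[0], b[1] - a[1])


def mask_value(p, a, b, visible_sign, band_width):
    """Mask in [0, 1]: 1 in the visible region, falling linearly to 0
    across a transition band on the invisible side of the dividing line."""
    d = visible_sign * signed_distance(p, a, b)   # >= 0 in the visible region
    if d >= 0:
        return 1.0
    return max(0.0, 1.0 + d / band_width)


# Hypothetical dividing line along the x axis, visible region above it,
# transition band four pixels wide.
a, b = (0, 0), (10, 0)
profile = [mask_value((5, y), a, b, 1, 4.0) for y in (3, 0, -1, -2, -4, -10)]
```

The profile is "1" throughout the visible region, decreases within the band, and is "0" beyond it.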

Referring again to FIG. 3A, after determining the mask values of the image, the 
stitching software 14 checks (228) whether there are any images after the current image. If
there are more images, the stitching software sets (210) a new current image to be the next 
image after the current image and proceeds to determine the mask values of the new current 
image (212-226). 

Based on the discussion above, the processing of subsequent images 18c' and 18d'
can be inferred. For example, referring to FIG. 3E, it will be appreciated that the visible area
of the third image 18c' will be set to the interior of an outline 87 at 206, and that when the
reference image is the first image 18a, the visible area will be reduced at 220 to the interior
of a smaller outline 86. Subsequent to that, when the reference image is set to the second
image 18b' the visible area will be further reduced to an even smaller outline 90.

Referring again to FIG. 3A, if there are no more images after the current image, the
image blender 58 overlaps the images 18a, 18b'-18d' based on the masking value to create 
the panoramic image (230). 
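
The iteration of FIG. 3A over current and reference images may be summarized by the following sketch. It assumes that each aligned image is represented by the set of canvas positions it covers, and it uses a hypothetical callback, `hide_section`, to stand in for steps 212-220; the crude callback in the example simply hides every overlapped pixel and is not the dividing-line rule described above.

```python
def compute_visible_areas(images, hide_section):
    """Return one set of visible canvas positions per image, in drawing order.

    images:       list of sets of (x, y) canvas positions, already aligned.
    hide_section: callback (visible, reference) -> reduced visible set,
                  standing in for steps 212-220.
    """
    visible = [set(image) for image in images]   # step 206: all pixels visible
    for cur in range(1, len(images)):            # steps 210 and 228
        for ref in range(cur):                   # steps 222 and 224
            visible[cur] = hide_section(visible[cur], images[ref])
    return visible


# Hypothetical one-row images and a simplified stand-in for steps 212-220.
images = [{(0, 0), (1, 0)}, {(1, 0), (2, 0)}, {(0, 0), (2, 0), (3, 0)}]
visible = compute_visible_areas(images, lambda vis, ref: vis - ref)
```

The first image is never reduced, and each later image is reduced once for every image before it, which mirrors the successive outlines 87, 86, and 90 of FIG. 3E.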

Referring to FIG. 4, the image blender starts with a clean background, known as a
canvas 120, onto which it draws the first image 18a to produce an image 120a, after which
the image blender draws the visible portion 121b of the second image 18b' onto the canvas
120 to produce the image 120b. In drawing the second image, the image blender computes
the pixel values of the image 120b according to the formula:

$\text{pixel}_{\text{panoramic}} = \text{pixel}_{\text{second image}} \times \text{mask\_value} + \text{pixel}_{\text{first image}} \times (1 - \text{mask\_value})$

Where:

$\text{pixel}_{\text{second image}}$ is the value of a pixel of the second image;

$\text{pixel}_{\text{first image}}$ is the value of a pixel of the first image that is at the same position as the
pixel of the second image; and

$\text{mask\_value}$ is the value of the mask of the pixel of the second image.

As can be seen from the formula above, where the mask value of a pixel of the second image
is "1", the second image completely obstructs the first image and where the mask value
is "0", the first image is completely visible through the
second image. However, when the mask value is between "0" and "1", the image blender
mixes the first and the second image, thereby smoothing the transition from one image to
another.
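
The blending formula may be applied pixel by pixel as in the following sketch. The representation of the canvas and images as lists of rows of grey values, the `offset` argument, and the function names are assumptions made for illustration.

```python
def blend_pixel(second, first, mask_value):
    """Mix a pixel of the second image with the pixel already on the canvas."""
    return second * mask_value + first * (1 - mask_value)


def draw_over(canvas, image, mask, offset):
    """Blend image onto canvas in place; mask has the same shape as image.

    offset is the (column, row) canvas position of the image's first pixel.
    """
    ox, oy = offset
    for y, row in enumerate(image):
        for x, value in enumerate(row):
            old = canvas[oy + y][ox + x]
            canvas[oy + y][ox + x] = blend_pixel(value, old, mask[y][x])


# Hypothetical one-row canvas holding the first image, and a second image
# whose left pixel lies in the transition band.
canvas = [[10, 10, 10]]
draw_over(canvas, [[50, 50]], [[0.5, 1.0]], (1, 0))
```

After the call, the leftmost canvas pixel still shows the first image, the middle pixel is an equal mix of the two images, and the rightmost pixel shows only the second image.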

After drawing the second image, the image blender draws the visible portion 121c of
the image 18c' to produce the image 120c. Finally, the image blender draws the visible
portion 121d of the image 18d' to produce the panoramic image 26 of FIG. 2B.

From the discussion above, it should be clear that the mask values of each image only
depend on the images before it. Consequently, the mask value of an earlier image does not
need to be recomputed when a newer image is removed or added. This saves on computing
time, resulting in a shorter response time. For example, when a user commands the stitching
software 14 to add a new image, the stitching software computes the mask of the new image
relative to the four images 18a, 18b'-18d'. The stitching software then draws the visible
portion of the new image over the canvas 120, thereby obstructing a portion of at least one of
the previously drawn images 18a, 18b'-18d'. If the user later commands the software to
remove the new image, the stitching software erases the canvas 120 and draws the visible
portions of the images 18a, 18b'-18d' in sequence based on the previously computed image
masks, as previously described.
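
The reuse of previously computed masks when the newest image is removed may be sketched as follows. The class and method names are hypothetical, and the images, masks, and offsets are assumed to have been computed beforehand; the sketch only shows that a redraw needs no new mask computation.

```python
class Panorama:
    """Stores each image with its precomputed mask, in drawing order."""

    def __init__(self, width, height, background=0.0):
        self.width, self.height, self.background = width, height, background
        self.layers = []   # (image, mask, (column, row) offset)

    def add(self, image, mask, offset):
        self.layers.append((image, mask, offset))

    def remove_last(self):
        self.layers.pop()

    def render(self):
        # Erase the canvas, then redraw every remaining layer in sequence.
        canvas = [[self.background] * self.width for _ in range(self.height)]
        for image, mask, (ox, oy) in self.layers:
            for y, row in enumerate(image):
                for x, value in enumerate(row):
                    m = mask[y][x]
                    old = canvas[oy + y][ox + x]
                    canvas[oy + y][ox + x] = value * m + old * (1 - m)
        return canvas


# Hypothetical one-row example: two images, then a third added and removed.
pano = Panorama(3, 1)
pano.add([[10, 10, 10]], [[1, 1, 1]], (0, 0))
pano.add([[50, 50]], [[0.5, 1]], (1, 0))
before = pano.render()
pano.add([[99]], [[1]], (0, 0))
with_new = pano.render()
pano.remove_last()
after = pano.render()
```

Because the masks of the earlier images do not depend on the image that was removed, the canvas after removal equals the canvas before the image was added.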

A number of embodiments of the invention have been described. Nevertheless, it will
be understood that various modifications may be made without departing from the spirit and
scope of the invention. For example, the images 18 to be blended may be obtained from a
digital camera, storage 16, or a network 32. The positioning module may determine the
relative positions of segments depicted in two images by prompting the user to use the
pointing device 24 to click on an object, such as the top left corner of the doorway, that is
depicted in both of the images and determining the relative positions based on the positions
that the user clicks on.

Accordingly, other embodiments are within the scope of the following claims.
WHAT IS CLAIMED IS:

1. A method of merging images of segments of a view, comprising:

determining the position of a second segment of the view represented by a second 
image relative to a first segment of the view represented by a first image; 

dividing the second image into a first section and a second section, based on the 
determined positions; 

drawing the first image on a canvas; and 

drawing the first section of the second image on the canvas at the determined position 
so that a portion of the first section masks out a portion of the first image. 

2. The method of claim 1, further comprising: 

determining a position of a third segment of the view, represented by a third image, 
relative to the first segment; 

dividing the third image into a third section and a second section, based on the 
determined position relative to the first segment; 

determining a position of the third segment of the view relative to the second image; 

dividing the third section into a fifth and a sixth section, based on the determined 
position relative to the second image; and 

drawing the fifth section of the third image on the canvas at the determined position 
relative to the third image so that a portion of the fifth section obstructs at least one of the 
first image and the first section of the second image. 

3. The method of claim 2, further comprising responding to a command to remove the third 
image by: 

erasing the canvas; 

drawing the first image on the canvas; and 

drawing the first section of the second image on the canvas at the determined position 
of the second segment relative to the first segment so that portions of the first section 
mask out portions of the first image. 
4. The method of claim 1, further comprising:

prior to dividing the second image, correcting perspective distortion in the second 
image. 

5. The method of claim 1, wherein the second image is divided into the first and second 
section by a dividing line that is produced based on: 

an outline of the first image; 

an outline of the second image; and 

the relative position of the second segment relative to the first segment. 

6. The method of claim 5, wherein the dividing line joins two points of intersection of the 
outlines of the first and second images when the second image is positioned at the 
determined relative position. 

7. The method of claim 6, wherein the dividing line joins the two intersection points of the 
outlines of the first and second images that are most distant from each other. 

8. The method of claim 5, wherein the first section of the second image is determined based
on how much of the second image on each side of the dividing line is overlapped by the 
first image. 

9. The method of claim 5, further comprising: 

determining a region around the dividing line where the second image is mixed with 
the first image to smooth out the transition between the first image and the second image. 

10. The method of claim 8, wherein the dividing line divides the region into a first sub-region 
contained within the first segment of the second image and a second sub-region contained 
within the second segment of the second region, more of the second image being mixed 
in the first sub-region than the second sub-region. 

11. An article comprising a computer-readable medium which stores computer-executable
instructions for merging images of segments of a view, the instructions causing a 
computer to: 

determine the position of a second segment of the view represented by a second 
image relative to a first segment of the view represented by a first image;

divide the second image into a first section and a second section, based on the 
determined positions; 

draw the first image on a canvas; and 
draw the first section of the second image on the canvas at the determined position so
that a portion of the first section masks out a portion of the first image.

12. The article of claim 11, wherein the instructions further cause the computer to: 

determine a position of a third segment of the view, represented by a third image, 
relative to the first segment; 
divide the third image into a third section and a second section, based on the
determined position relative to the first segment;

determine a position of the third segment of the view relative to the second image; 

divide the third section into a fifth and a sixth section, based on the determined 
position relative to the second image; and 
draw the fifth section of the third image on the canvas at the determined position
relative to the third image so that a portion of the fifth section obstructs at least one of the
first image and the first section of the second image. 

13. The article of claim 12, wherein the instructions further cause the computer to
respond to a command to remove the third image by:

erasing the canvas;

drawing the first image on the canvas; and 

drawing the first section of the second image on the canvas at the determined position 
of the second segment relative to the first segment so that portions of the first section 
mask out portions of the first image. 



14. The article of claim 11, wherein the instructions further cause the computer to:

correct perspective distortion in the second image, prior to dividing the second image.

15. The article of claim 11, wherein the instructions cause the computer to divide the second 
image into the first and second section by a dividing line that is produced based on: 

an outline of the first image; 
an outline of the second image; and

the relative position of the second segment relative to the first segment. 

16. The article of claim 15, wherein the dividing line joins two points of intersection of the 
outlines of the first and second images when the second image is positioned at the 
determined relative position. 

17. The article of claim 16, wherein the dividing line joins the two intersection points of the 
outlines of the first and second images that are most distant from each other. 

18. The article of claim 15, wherein the computer instructions cause the computer to 
determine first section of the second image based on how much of the second image on 
each side of the dividing line is overlapped by the first image. 

19. The article of claim 15, wherein the computer instructions further cause the computer to: 

determine a region around the dividing line where the second image is mixed with the 
first image to smooth out the transition between the first image and the second image. 

20. The article of claim 18, wherein the dividing line divides the region into a first sub-region 
contained within the first segment of the second image and a second sub-region contained 
within the second segment of the second region, the computer instructions causing the 
computer to mix more of the second image in the first sub-region than the second sub- 
region. 